X-1462-1P US 



PATENT 



TUNING PROGRAMMABLE LOGIC DEVICES FOR LOW-POWER DESIGN

IMPLEMENTATION 

Tim Tuan 
Jan L. de Jong 
Kameswara K. Rao 
Robert O. Conn

Related Application 

[0001] The present application is a continuation-in-part of 
U.S. Patent Application Serial No. 10/666,669 filed by Tim Tuan,
Kameswara K. Rao and Robert O. Conn on September 19, 2003, which
is incorporated herein in its entirety. 

FIELD OF THE INVENTION 

[0002] The present invention relates to the regulation of the 
supply voltage provided to unused and/or inactive blocks in a 
programmable logic device to achieve lower power consumption. 
More specifically, the present invention relates to selectively 
reducing the operating voltage of various sections of an 
integrated circuit device in order to reduce the leakage current 
and/or increase the performance of the device. 

RELATED ART 

[0003] Programmable logic devices (PLDs), such as field
programmable gate arrays (FPGAs), have a significantly higher
static power consumption than dedicated logic devices, such as
standard-cell application specific integrated circuits (ASICs).
A reason for this high static power consumption is that for any 
given design, a PLD only uses a subset of the available 
resources. The unused resources are necessary for providing 
greater mapping flexibility to the PLD. However, these unused 
resources still consume static power in the form of leakage 
current. Consequently, PLDs are generally less likely to be 
used in applications where low static power is required. 
[0004] It would therefore be desirable to have a PLD having a 
reduced static power consumption. 
[0005] Programmable logic devices (PLDs) also have a
significantly higher dynamic power consumption than dedicated
logic devices, because the PLD resources (logic and routing) are
designed with a fixed level of performance, regardless of the 
requirements of the specific application being implemented by 
the PLD. Most PLD applications do not require the maximum 
hardware speed for some (or even all) parts of the PLD. As a 
result, "timing slack" exists in different parts of the PLD. In
fact, the timing critical part of a PLD design typically 
represents a very small portion of the whole design. In circuit 
design, higher speed circuits generally consume more power, both 
dynamic and static. Consequently, the parts of the PLD that are 
not operated at the maximum hardware speed represent an 
inefficient use of power. 

[0006] It would therefore be desirable to improve the power 
efficiency of a programmable logic device by taking advantage of 
the timing slack present in different parts of a PLD design. 

SUMMARY 

[0007] In accordance with one embodiment of the present 
invention, unused and/or inactive resources in a PLD are 
disabled to achieve lower power consumption. 
[0008] One embodiment of the present invention provides a 
method of operating a PLD, which includes the steps of enabling 
the resources of the PLD that are used in a particular circuit 
design, and disabling the resources of the PLD that are unused 
or inactive. The step of disabling can include de-coupling the 
unused or inactive resources from one or more power supply 
terminals. Alternately, the step of disabling can include 
regulating (e.g., reducing) a supply voltage applied to the 
unused or inactive resources. 

[0009] In accordance with one embodiment, the step of 
disabling can be performed in response to configuration data 
bits stored by the PLD. These configuration data bits can be 
determined during the design of the circuit to be implemented by 
the PLD. That is, during the design, the design software is 
able to identify unused resources of the PLD, and select the
configuration data bits to disable these unused resources. 
[0010] The step of disabling can also be performed in 
response to user-controlled signals. These user-controlled 
signals can be generated in response to observable operating 
conditions of the PLD. For example, if certain resources of the 
operating PLD are inactive for a predetermined time period, then 
the user-controlled signals may be activated, thereby causing 
the inactive resources to be disabled. 

[0011] In accordance with another embodiment, a PLD includes 
a first voltage supply terminal that receives a first supply 
voltage, a plurality of programmable logic blocks, and a 
plurality of switch elements, wherein each switch element is 
coupled between one of the programmable logic blocks and the 
first voltage supply terminal. A control circuit coupled to the 
switch elements provides a plurality of control signals that 
selectively enable or disable the switch elements. The control 
circuit can be controlled by a plurality of configuration data 
values stored by the PLD and/or a plurality of user-controlled 
signals.

[0012] In an alternate embodiment, each of the switch 
elements can be replaced by a switching regulator. In this 
embodiment, the operating voltage applied to different blocks of 
the PLD may be adjusted in view of the timing slack available in 
these blocks. That is, a block with a large amount of timing 
slack can be operated at a lower voltage, thereby causing the 
block to operate at a slower speed, which is acceptable within 
the parameters of the PLD design. The lower operating voltage 
advantageously reduces the leakage current in the block. Blocks 
with a small amount of timing slack are operated at a higher 
voltage, thereby enabling these blocks to operate at the 
required high speed. 

[0013] In accordance with one embodiment, the switching 
regulator can be a high-voltage n-channel transistor having a 
drain coupled to the V DD voltage supply and a source coupled to 
the programmable logic block. The gate of the high voltage 
transistor is coupled to receive a control voltage from a
corresponding control circuit. The control circuit determines 
whether the corresponding programmable logic block is in an 
active or inactive state in response to user controlled signals 
and/or configuration data bits. When the programmable logic 
block is active, the control circuit applies a high control 
voltage V BOOST, which is greater than the V DD supply voltage, to
the gate of the high voltage transistor, such that the full V DD
supply voltage is applied to the programmable logic block. When
the programmable logic block is inactive, the control circuit
applies a low control voltage V STANDBY, which is less than the V DD
supply voltage, to the gate of the high voltage transistor, such 
that a voltage of about one half the V DD supply voltage is 
applied to the programmable logic block. A feedback mechanism 
can be employed to ensure that the voltage applied to the 
programmable logic block is precisely equal to one half the V DD 
supply voltage. 

[0014] In accordance with another embodiment, a method of 
operating a programmable logic device includes the steps of 
using a full V DD supply voltage to operate a first set of active 
blocks of the programmable logic device, and using a reduced 
supply voltage (e.g., 0.9 V DD) to operate a second set of active
blocks of the programmable logic device. A timing analysis is 
performed during design time and/or run time, in order to 
determine the maximum available timing slack in each active 
block. Active blocks having a relatively small timing slack are 
grouped in the first set, and are coupled to receive the full V DD 
supply voltage. As a result, the active blocks in the first set 
receive a voltage high enough to enable these blocks to meet the 
timing requirements of the PLD design. 

[0015] Active blocks having a relatively large timing slack 
are grouped in the second set, and are coupled to receive the 
reduced V DD supply voltage. As a result, the active blocks in 
the second set exhibit reduced power consumption (as a result of 
operating in response to the reduced V DD supply voltage). In
addition, the active blocks in the second set meet the timing
requirements of the PLD design, in spite of operating in
response to the reduced V DD supply voltage, because of the large 
timing slack initially present in these blocks. As a result, 
operating the active blocks in the second set at the reduced V DD 
supply voltage does not adversely affect the overall speed of 
the programmable logic device. 
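
The grouping of active blocks by timing slack described above can be sketched as follows. This is only an illustrative sketch: the `Block` type, the `assign_supply` function, the block names, and the 20% slack threshold are all hypothetical and are not taken from the described embodiments. A real flow would also need to confirm that each block still meets timing at the reduced voltage.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str        # hypothetical block identifier
    slack_ns: float  # timing slack reported by a timing analysis (assumed input)


def assign_supply(blocks, critical_delay_ns, slack_fraction=0.2):
    """Split active blocks into a full-VDD set and a reduced-VDD set.

    slack_fraction is an assumed threshold: blocks whose slack is at least
    this fraction of the critical delay are placed in the reduced-VDD set.
    """
    threshold_ns = slack_fraction * critical_delay_ns
    full_vdd, reduced_vdd = [], []
    for block in blocks:
        if block.slack_ns >= threshold_ns:
            reduced_vdd.append(block.name)  # large slack: can run slower
        else:
            full_vdd.append(block.name)     # small slack: needs full speed
    return full_vdd, reduced_vdd


blocks = [Block("blk1", 0.3), Block("blk2", 2.5),
          Block("blk3", 4.0), Block("blk4", 1.0)]
full_vdd, reduced_vdd = assign_supply(blocks, critical_delay_ns=10.0)
print("full VDD:", full_vdd)
print("reduced VDD:", reduced_vdd)
```

With these assumed inputs, blocks with at least 2 ns of slack land in the reduced-voltage set and the remaining blocks keep the full V DD supply.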

[0016] The reduced V DD supply voltage can be supplied in 
various manners, including, but not limited to, variable voltage 
switching regulators, or a separate voltage supply. The 
application of the full V DD voltage supply or the reduced V DD 
voltage supply can be controlled by configuration data bits 
and/or user control signals. 

[0017] The present invention will be more fully understood in 
view of the following description and drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0018] Fig. 1 is a flow diagram illustrating a conventional 
design flow used for PLDs. 

[0019] Fig. 2 is a flow diagram illustrating a design flow 
for a PLD in accordance with one embodiment of the present 
invention. 

[0020] Fig. 3 is a block diagram of a conventional PLD having 
four blocks, which are all powered by the same off-chip V DD
voltage supply. 

[0021] Fig. 4 is a block diagram of a PLD that implements 
power-gating switch elements in accordance with one embodiment 
of the present invention. 

[0022] Fig. 5 is a block diagram of a PLD that implements 
switching regulators in accordance with one embodiment of the 
present invention.

[0023] Fig. 6 is a block diagram of the PLD of Fig. 5, which 
shows switching regulators in accordance with one embodiment of 
the present invention. 

[0024] Fig. 7 is a flow diagram illustrating a design flow in 
accordance with the operating voltage tuning embodiment of the 
present invention.
[0025] Fig. 8 is a block diagram of a PLD in accordance with
a voltage tuning embodiment of the invention. 
[0026] Fig. 9 is a block diagram of a PLD that implements 
variable voltage switching regulators in accordance with one 
embodiment of the invention. 

[0027] Fig. 10 is a circuit diagram of a level-shifting flip- 
flop for use in various embodiments of the present invention. 

DETAILED DESCRIPTION 

[0028] In accordance with one embodiment of the present 
invention, unused and inactive resources in a programmable logic 
device (PLD), such as a field programmable gate array (FPGA),
are disabled to achieve lower static power consumption. The 
present invention includes both an enabling software flow and an 
enabling hardware architecture, which are described in more 
detail below. Unused resources of the PLD can be disabled when 
designing a particular circuit to be implemented by the PLD 
(hereinafter referred to as "design time"). In addition, 
resources of the PLD that are temporarily inactive can be 
disabled during operation of the PLD (hereinafter referred to as
"run time").

[0029] Fig. 1 is a flow diagram 100 illustrating a 
conventional design flow used for PLDs. Initially, a user
designs a circuit to be implemented by the PLD (Step 101). This
user design is described in a high-level specification, such as
Verilog or VHDL. The high-level specification is first
synthesized to basic logic cells available on the PLD (Step
102). A place and route process then assigns every logic cell
and wire in the design to some physical resource in the PLD
(Step 103). The design is then converted into a configuration
bit stream, in a manner known to those of ordinary skill in the
art (Step 104). The configuration bit stream is then used to
configure the device by setting various on-chip configuration
memory cells (Step 105). While modern design flows may be much
more complex, they all involve the basic steps defined by flow 
diagram 100. 
[0030] In accordance with the present invention, unused
resources of the PLD are identified during the design time,
following the place and route process (Step 103). These unused
resources are then selectively disabled during the design time. 
As described below, there are several ways to disable the unused 
resources. By selectively disabling the unused resources at 
design time, significant static power reduction may be achieved 
with no performance penalty. 

[0031] Fig. 2 is a flow diagram 200 illustrating a design 
flow in accordance with one embodiment of the present invention. 
Similar steps in flow diagrams 100 and 200 are labeled with
similar reference numbers. Thus, flow diagram 200 includes 
Steps 101-105 of flow diagram 100, which are described above. 
In addition, flow diagram 200 includes the step of disabling 
unused resources in the PLD (Step 201). This step of disabling
unused resources is performed after the place and route process 
has been completed in Step 103, and before the configuration bit 
stream is generated in Step 104. As described in more detail 
below, the unused resources are disabled by disabling 
predetermined programmable logic blocks of the PLD. 
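
A minimal sketch of Step 201 is given below, assuming the place and route result is available as a mapping from logic cell names to block numbers. The function name `disable_bits`, the cell names, and the bit polarity (1 meaning "disable") are illustrative assumptions rather than details of any actual design tool.

```python
def disable_bits(all_blocks, placement):
    """Return one configuration bit per block (assumed polarity: 1 = disable).

    all_blocks: iterable of block identifiers on the device.
    placement:  dict mapping each placed logic cell to the block holding it
                (a hypothetical stand-in for the place and route output).
    """
    used_blocks = set(placement.values())
    return {blk: 0 if blk in used_blocks else 1 for blk in all_blocks}


# Hypothetical design that occupies only blocks 301 and 303.
placement = {"and0": 301, "ff0": 301, "mul0": 303}
bits = disable_bits([301, 302, 303, 304], placement)
print(bits)
```

In this sketch the resulting bits would simply be merged into the configuration bit stream generated in Step 104, so that blocks 302 and 304 are disabled when the device is configured.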
[0032] In another embodiment, further power savings are 
obtained by disabling temporarily inactive resources of the 
configured PLD during run time. Often, the entire design or 
parts of the design are temporarily inactive for some period of 
time. If the inactive period is sufficiently long, it is 
worthwhile to disable the inactive resources to reduce static 
power consumption. In a preferred embodiment, the decision of 
when to disable a temporarily inactive resource is made by the
designer. In this embodiment, the user logic is provided access 
to a disabling mechanism, which enables the inactive resources 
to be disabled dynamically. 

[0033] There are a number of techniques to disable resources 
in a PLD. In accordance with one embodiment, the PLD is 
logically subdivided into a plurality of separate programmable 
logic blocks. As described below, each programmable logic block 
may comprise one or more of the resources available on the 
programmable logic device. Switch elements are used to couple
each of the programmable logic blocks to one or more associated 
voltage supply terminals (e.g., V DD or ground). The switch 
elements are controlled to perform a power-gating function, 
wherein unused and/or inactive programmable logic blocks are 
disabled (e.g., prevented from receiving power or receiving a 
reduced power). Preferably, only one of the voltage supply
terminals (V DD or ground) is power-gated, thereby reducing the 
speed and area penalties associated with the switch elements. 
When the switch elements are controlled to de-couple the 
associated programmable logic blocks from the associated supply 
voltage, these programmable logic blocks are effectively 
disabled, thereby dramatically reducing the static power 
consumption of these blocks. 

[0034] Fig. 3 is a block diagram of a conventional PLD 300 
having four programmable logic blocks 301-304, which are all 
powered by the same off-chip V DD voltage supply 305. Note that 
all four programmable logic blocks 301-304 are coupled to 
receive the V DD supply voltage during normal operating 
conditions, even if some of these blocks are not used in the 
circuit design. 

[0035] Fig. 4 is a block diagram of a PLD 400 in accordance
with one embodiment of the present invention. Similar elements 
in Figs. 3 and 4 are labeled with similar reference numbers. 
Thus, PLD 400 includes programmable logic blocks 301-304 and V DD 
voltage supply 305. In addition, PLD 400 includes switch 
elements 401-408, and control circuit 409. In the described 
embodiment, switch elements 401-404 are implemented by PMOS 
power-gating transistors 451-454, respectively, and switch 
elements 405-408 are implemented by NMOS power-gating 
transistors 455-458, respectively. In other embodiments, switch 
elements 401-408 may be any switch known to those ordinarily 
skilled in the art. Control circuit 409 is implemented by 
inverters 411-414, NOR gates 421-424, configuration memory cells 
431-434, and user logic input terminals 441-444. 
[0036] NOR gates 421-424 and inverters 411-414 are configured
to generate power-gating control signals SLEEP 1 -SLEEP 4 and
SLEEP# 1 -SLEEP# 4 in response to the configuration data values
CD 1 -CD 4 stored in configuration memory cells 431-434, respectively,
and the user control signals UC 1 -UC 4 provided on user logic input
terminals 441-444, respectively.

[0037] For example, NOR gate 421 is coupled to receive
configuration data value CD 1 from configuration memory cell 431
and user control signal UC 1 from user logic input terminal 441.
If either the configuration data value CD 1 or the user control
signal UC 1 is activated to a logic high state, then NOR gate 421
provides an output signal (SLEEP# 1) having a logic "0" state.
In response, inverter 411, which is coupled to the output
terminal of NOR gate 421, provides an output signal (SLEEP 1)
having a logic "1" state.

[0038] The SLEEP 1 signal is applied to the gate of PMOS 
power-gating transistor 451, which is coupled between block 301 
and the V DD voltage supply terminal. The SLEEP# 1 signal is 
applied to the gate of NMOS power-gating transistor 455, which 
is coupled between block 301 and the ground voltage supply 
terminal. The logic "0" state of the SLEEP# 1 signal causes NMOS
power-gating transistor 455 to turn off, thereby de-coupling
block 301 from the ground supply voltage terminal. Similarly,
the logic "1" state of the SLEEP 1 signal causes PMOS power-gating
transistor 451 to turn off, thereby de-coupling block 301
from the V DD supply voltage terminal. De-coupling block 301 from
the V DD and ground supply voltage terminals effectively disables
block 301, thereby minimizing the static leakage current in this
block. 

[0039] If both the configuration data value CD 1 and the user
control signal UC 1 are de-activated to a logic low state, then
NOR gate 421 provides a SLEEP# 1 signal having a logic "1" state,
and inverter 411 provides a SLEEP 1 signal having a logic "0"
state. The logic "1" state of the SLEEP# 1 signal causes NMOS
power-gating transistor 455 to turn on, thereby coupling block
301 to the ground supply voltage terminal. Similarly, the logic
"0" state of the SLEEP 1 signal causes PMOS power-gating
transistor 451 to turn on, thereby coupling block 301 to the V DD
supply voltage terminal. Coupling block 301 to the V DD and
ground supply voltage terminals effectively enables block 301.
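
The behavior of NOR gate 421, inverter 411, and power-gating transistors 451 and 455 can be summarized with a small logic model. This is a behavioral sketch only; the function name `sleep_signals` and the boolean "block enabled" result are illustrative conveniences, not part of the described circuit.

```python
def sleep_signals(cd, uc):
    """Model one slice of control circuit 409 for a single block.

    cd: configuration data value (1 = activated)
    uc: user control signal (1 = activated)
    Returns (sleep, sleep_n, block_enabled).
    """
    sleep_n = int(not (cd or uc))  # NOR gate output (SLEEP#)
    sleep = 1 - sleep_n            # inverter output (SLEEP)
    pmos_on = sleep == 0           # PMOS conducts when its gate is low
    nmos_on = sleep_n == 1         # NMOS conducts when its gate is high
    return sleep, sleep_n, pmos_on and nmos_on


# Print the full truth table for one block.
for cd in (0, 1):
    for uc in (0, 1):
        sleep, sleep_n, enabled = sleep_signals(cd, uc)
        print(f"CD={cd} UC={uc} SLEEP={sleep} SLEEP#={sleep_n} enabled={enabled}")
```

The block is powered only when both inputs are low; activating either the configuration data value or the user control signal turns off both transistors.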
[0040] Programmable logic block 302 may be enabled and 
disabled in response to configuration data value CD 2 and user
control signal UC 2, in the same manner as block 301. Similarly,
programmable logic block 303 may be enabled and disabled in
response to configuration data value CD 3 and user control signal
UC 3, in the same manner as block 301. Programmable logic block
304 may be enabled and disabled in response to configuration
data value CD 4 and user control signal UC 4, in the same manner as
block 301. 

[0041] As described above, when a programmable logic block is 
used and active, the associated power-gating transistors are 
turned on. Conversely, when a programmable logic block is 
unused or inactive, the associated power-gating transistors are
turned off. The SLEEP 1 -SLEEP 4 and SLEEP# 1 -SLEEP# 4 signals can be
controlled by the configuration data values CD 1 -CD 4 stored by
configuration memory cells 431-434, which are best suited for
disabling the associated blocks at design time. If a block is
not disabled at design time, this block can be disabled at run
time by the user control signals UC 1 -UC 4, which may be generated
by the user logic, or by other means. 

[0042] In accordance with another embodiment of the present 
invention, some blocks have multiple supply voltages. In this 
case all of the supply rails should be power-gated to achieve 
maximum power reduction. In accordance with another embodiment, 
only one switch element may be associated with each block. That 
is, the blocks are power-gated by decoupling the block from only 
one power supply terminal, and not both the V DD and ground supply 
voltage terminals, thereby conserving layout area. 
[0043] The granularity of the power-gated programmable logic 
blocks can range from arbitrarily small circuits to significant 
portions of the PLD. The decision concerning the size of each 
programmable logic block is made by determining the desired 
trade-off among power savings, layout area overhead of the
switch elements and the control circuit, and speed penalty. In
an FPGA, each programmable logic block may be selected to include
one or more configuration logic blocks (CLBs), input/output
blocks (IOBs), and/or other resources of the FPGA (such as block
RAM, processors, multipliers, adders, transceivers).
[0044] Another way to disable a programmable logic block is 
by scaling down the local supply voltage to the block as low as 
possible, which dramatically reduces the power consumption of 
the block. To scale down the local supply voltage in this 
manner, each independently controlled programmable logic block 
is powered by a separate switching regulator. 

[0045] Fig. 5 is a block diagram of a PLD 500 that implements 
switching regulators in accordance with one embodiment of the 
present invention. Similar elements in Figs. 3 and 5 are 
labeled with similar reference numbers. Thus, PLD 500 includes 
programmable logic blocks 301-304 and V DD voltage supply 305. In 
addition, PLD 500 includes switching regulators 501-504, which
are coupled between blocks 301-304, respectively, and V DD voltage
supply 305. Switching regulators 501-504 are controlled by
control circuits 511-514, respectively. In the described 
embodiment, switching regulators 501-504 reside on the same chip 
as blocks 301-304. However, in other embodiments, these 
switching regulators can be located external to the chip 
containing blocks 301-304. Switching regulators 501-504 can be 
programmably tuned to provide the desired supply voltages to the 
associated programmable logic blocks 301-304. For example, 
switching regulator 501 can provide a full V DD supply voltage to 
programmable logic block 301 when this block is used and active.
However, switching regulator 501 can further be controlled to 
provide a reduced voltage (e.g., some percentage of the V DD 
supply voltage) to programmable logic block 301 when this block 
is unused or inactive. This reduced voltage may be 
predetermined (by design or via testing) depending on the 
desired circuit behavior. For example, this reduced voltage may 
be the minimum voltage required to maintain the state of the 
associated blocks. The power consumption of block 301 is
significantly reduced when the supplied voltage is reduced in 
this manner. 

[0046] Switching regulators 501-504 are controlled in 
response to the configuration data values C 1 -C 4 stored in
configuration memory cells 511-514, respectively, and the user
control signals U 1 -U 4 provided on user control terminals 521-524,
respectively. A configuration data value (e.g., C 1) having an
activated state will cause the associated switching regulator
(e.g., switching regulator 501) to provide a reduced voltage to
the associated programmable logic block (e.g., block 301).
Similarly, a user control signal (e.g., U 2) having an activated
state will cause the associated switching regulator (e.g.,
switching regulator 502) to provide a reduced voltage to the
associated programmable logic block (e.g., block 302). A
configuration data value (e.g., C 3) and an associated user
control signal (e.g., U 3) both having deactivated states
will cause the associated switching regulator (e.g., switching
regulator 503) to provide the full V DD supply voltage to the
associated programmable logic block (e.g., block 303).
[0047] In accordance with one embodiment, configuration data 
values C 1 -C 4 may be selected at design time, such that reduced
voltages are subsequently applied to unused blocks during run
time. User control signals U 1 -U 4 may be selected during run
time, such that reduced voltages are dynamically applied to
inactive blocks at run time. Techniques for distributing
multiple programmable down-converted voltages using on-chip
switching voltage regulators are described in more detail in
U.S. Patent Application Serial No. 10/606,619, "Integrated
Circuit with High-Voltage, Low-Current Power Supply Distribution
and Methods of Using the Same" by Bernard J. New et al., which
is hereby incorporated by reference. 

[0048] In the embodiment of Fig. 5, the granularity of the 
voltage scaled programmable logic blocks 301-304 should be 
fairly large because the overhead associated with switching
regulators 501-504 is significant. In an FPGA, each
programmable logic block 301-304 would most likely be divided
into several clusters of configuration logic blocks (CLBs). The
exact size of each programmable logic block may be determined by 
the desired trade-off among power savings, layout area overhead 
of the switching regulators, and the speed penalty. 
[0049] Fig. 6 is a block diagram of PLD 500, which shows 
switching regulators 501-504 in accordance with one embodiment 
of the present invention. Switching regulators 501-504 include 
control blocks 601-604, respectively, and high-voltage n-channel 
transistors 611-614, respectively. High-voltage n-channel 
transistors 611-614 can tolerate high voltages and may have 
relatively thick gate dielectric layers (e.g., 50 to 60 
Angstroms) and relatively wide channel regions. In some 
embodiments, the gate dielectric thickness of the high-voltage
n-channel transistors 611-614 is approximately 4 to 6 times
the gate dielectric thickness used in the
programmable logic blocks 301-304. The drain of each of
n-channel transistors 611-614 is coupled to the V DD voltage supply
305. The gates of n-channel transistors 611-614 are coupled to
receive the control voltages V C1 -V C4, respectively, from the
corresponding control blocks 601-604. The source of each of
n-channel transistors 611-614 is configured to provide an
operating voltage V 1 -V 4, respectively, to programmable logic
blocks 301-304, respectively. The source of each n-channel 
transistor 611-614 is also coupled to the corresponding control 
block 601-604 in a feedback configuration. 

[0050] Each of n-channel transistors 611-614 forms a power 
switch between the V DD supply voltage 305 and the associated
programmable logic block. Thick oxide n-channel transistors
611-614 are used to implement the power switches to ensure that
a high voltage, herein referred to as V BOOST, can be applied to
the gates of n-channel transistors 611-614 when the associated
programmable logic block is active. The high voltage V BOOST
increases the drive current of n-channel transistors 611-614.
In accordance with one embodiment, the high voltage V BOOST is
about 2 to 2.5 times greater than V DD. When the high voltage
V BOOST is applied to the gate of one of transistors 611-614, the
corresponding operating voltage V 1 -V 4 is pulled up to the full V DD
supply voltage.

[0051] When a programmable logic block (e.g., programmable 
logic block 301) is inactive, the associated operating voltage 
(e.g., V 1) is reduced. The operating voltage applied to the
associated programmable logic block is preferably selected to be 
high enough to retain data stored in this programmable logic 
block. In one embodiment, the operating voltage is reduced to a 
voltage that is about one half the V DD supply voltage. The 
operating voltage is reduced by applying a low voltage V STANDBY to 
the gate of the corresponding n-channel transistor (e.g., 
transistor 611). In one embodiment, the low voltage V STANDBY is
about 80 to 100 percent of the V DD supply voltage. 
[0052] In accordance with one embodiment, each of control 
blocks 601-604 is independently controlled to provide either the 
high voltage V BOOST or the low voltage V STANDBY to the associated
n-channel transistor 611-614.

[0053] For example, control block 601 is configured to 
receive the user control signal U 1 and the configuration data
value C 1, which have been described above. If both the user
control signal U 1 and the configuration data value C 1 are
deactivated, then control block 601 provides a control voltage
V C1 equal to the high voltage V BOOST to the gate of n-channel
transistor 611. As a result, an operating voltage V 1 equal to
the V DD supply voltage is applied to programmable logic block 
301. 

[0054] However, if either user control signal U 1 or
configuration data value C 1 is activated, then control block 601
provides a control voltage V C1 equal to the low voltage V STANDBY to
the gate of n-channel transistor 611. As a result, an operating
voltage V 1 approximately equal to one half the V DD supply voltage
is applied to programmable logic block 301.

[0055] To ensure that the operating voltage V 1 applied to
programmable logic block 301 has a value of ½ V DD when the V STANDBY
voltage is applied to the gate of transistor 611, the control
block 601 may include a feedback mechanism that adjusts the low
voltage V STANDBY signal until the operating voltage V 1 is precisely
equal to ½ V DD, or any other desired voltage.
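
One way to picture such a feedback mechanism is a simple iterative loop that nudges the gate voltage until the block voltage reaches its target. The sketch below is purely illustrative: the source-follower model, the 0.45 V threshold drop, the 1.2 V supply, the loop gain, and the function names are all assumptions chosen for the example and do not describe the actual control block 601.

```python
def source_follower(v_gate, v_dd, v_th=0.45):
    """Toy transistor model (assumed): the source voltage follows the gate
    voltage minus a fixed threshold drop, clamped to the range [0, v_dd]."""
    return max(0.0, min(v_dd, v_gate - v_th))


def settle_standby(v_dd, v_start, target_fraction=0.5,
                   gain=0.5, tol=1e-3, max_iter=100):
    """Adjust the standby gate voltage until the block voltage is within
    tol of target_fraction * v_dd. Returns (gate voltage, block voltage)."""
    target = target_fraction * v_dd
    v_gate = v_start
    for _ in range(max_iter):
        error = target - source_follower(v_gate, v_dd)
        if abs(error) <= tol:
            break
        v_gate += gain * error  # proportional correction toward the target
    return v_gate, source_follower(v_gate, v_dd)


v_standby, v_block = settle_standby(v_dd=1.2, v_start=1.2)
print(f"V_STANDBY = {v_standby:.3f} V, block voltage = {v_block:.3f} V")
```

Under these assumed values the loop settles with the gate voltage somewhat below V DD and the block voltage within a millivolt of one half V DD.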
[0056] It is well known that the gate current through a 
transistor typically increases by an order of magnitude for 
every 0.3 Volt increase in the V DD supply voltage. It is 
therefore expected that reducing the operating voltage of a 
programmable logic block by half (½ V DD) will reduce the gate
current through the transistors present in the programmable 
logic block by an order of magnitude or more. At the same time, 
the sub-threshold leakage of these transistors will also 
decrease with the reduced operating voltage. Based on earlier 
generation technology, the leakage current may be reduced by 70%
or more when reducing the operating voltage to ½ V DD. Simulation
of a ring oscillator shows that the ring oscillator will operate
properly at the lower operating voltage (½ V DD). It can be
expected that the associated logic block will retain stored data
using the lower operating voltage. Therefore, the proposed 
switching regulators are capable of achieving more than 70% 
reduction in leakage current without a significant increase in 
area penalty and without sacrificing desired functionality. 
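
The order-of-magnitude estimate for gate current can be checked with a short calculation based on the stated rule of one decade per 0.3 Volt. The 1.2 V supply used here is an assumed example value, and `gate_current_ratio` is a hypothetical helper name.

```python
def gate_current_ratio(delta_v, volts_per_decade=0.3):
    """Factor by which gate current changes for a supply change of delta_v,
    using the rule of thumb of one order of magnitude per 0.3 V."""
    return 10 ** (delta_v / volts_per_decade)


v_dd = 1.2                 # assumed example supply voltage, in volts
delta_v = v_dd - 0.5 * v_dd  # voltage removed by halving the supply
reduction = gate_current_ratio(delta_v)
print(f"Halving a {v_dd} V supply removes {delta_v:.1f} V, "
      f"reducing gate current by about {reduction:.0f}x")
```

For this assumed supply the reduction is about two orders of magnitude. More generally, any supply of 0.6 V or more loses at least 0.3 V when halved, which is consistent with a reduction of an order of magnitude or more.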
[0057] In accordance with yet another embodiment of the 
present invention, the operating voltages applied to different 
blocks of a PLD are tuned based on application-specific timing 
characteristics to achieve a more power-efficient design 
implementation. Both the hardware architecture necessary to 
enable the tuning and the software flow used to perform the 
tuning are described below. The tuning may be performed at 
design time to optimize resources that have timing slacks, or at 
runtime to exploit periods of low workload. 

[0058] It can be determined at design time, after the place 
and route steps, which parts of the PLD design have timing 
slacks. Programmable logic blocks with timing slacks are faster 
than what is necessary to meet the timing requirements of the 
PLD design. These blocks may be tuned to be slower, such that 
their timing slacks are reduced or eliminated, without 
negatively impacting the timing requirements of the overall 
design. The methods by which the programmable logic blocks are 
tuned also lower the power consumption of these blocks, thereby 
achieving a significant power reduction with no timing penalty. 
In essence, tuning the chip in this manner customizes the 
programmable logic device to meet the timing requirements of the 
PLD design, thereby resulting in a more power-efficient design 
mapping. 

[0059] Fig. 7 is a flow diagram 700 illustrating a design 
flow in accordance with the operating voltage tuning embodiment 
of the present invention. Similar steps in flow diagrams 100 
and 700 are labeled with similar reference numbers. Thus, flow 
diagram 700 includes Steps 101-105 of flow diagram 100, which 
are described above. In addition, flow diagram 700 includes the 
step of performing a timing analysis on the PLD design (Step 
701). This timing analysis identifies the delays along various 
paths of the PLD design. This timing analysis may be performed 
after the place and route process has been completed in Step 
103. 

[0060] After the timing analysis is complete, all paths 
having significant timing slacks are identified, along with the 
amount of slack each path possesses. In one embodiment, the 
timing slacks may be identified by comparing the expected delay 
of each path with the critical delay of the design. That is, the 
paths having timing slacks of N% or more of the critical delay 
are identified (Step 702). For example, in a synchronous design 
where the critical delay is 10 ns and N is 20, all paths at least 
20% faster than the critical delay are identified. As a result, 
all paths with a delay of 8 ns or less are identified. These 
paths can all be slowed by at least 2 ns without impacting the 
timing of the overall design. 
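The selection performed in Step 702 can be sketched in Python as follows. This is a minimal illustration, not the tool flow itself: the function name, the path names, and the dictionary-based representation of path delays are all hypothetical.

```python
def paths_with_slack(path_delays_ns, critical_delay_ns, min_slack_pct):
    """Return {path: slack_ns} for every path whose slack is at least
    min_slack_pct percent of the critical delay."""
    min_slack_ns = critical_delay_ns * min_slack_pct / 100
    selected = {}
    for name, delay_ns in path_delays_ns.items():
        slack_ns = critical_delay_ns - delay_ns
        if slack_ns >= min_slack_ns:
            selected[name] = slack_ns
    return selected


# Hypothetical path delays reported by a timing analysis (names are made up).
delays = {"path_a": 10.0, "path_b": 9.1, "path_c": 8.0, "path_d": 6.5}
print(paths_with_slack(delays, critical_delay_ns=10.0, min_slack_pct=20))
```

With the 10 ns critical delay and N = 20 of the example above, only the paths with a delay of 8 ns or less (path_c and path_d in this made-up data) are selected.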

[0061] The minimum operating voltage for each block is then 
determined (Step 703). In accordance with one embodiment, a 
translation table is used (Step 704), wherein the translation 
table provides a minimum operating voltage in response to a 
particular timing slack. Note that for larger timing slacks, 
the minimum operating voltage will be lower. 

[0062] Following the identification of paths with significant 
timing slacks, each independently tunable programmable logic 
block in the device is examined to determine the maximum amount 
of acceptable delay increase, which corresponds to the minimum 
timing slack among all of the paths in the programmable logic 
block (Step 703). Then, the minimum timing slack for each 
programmable logic block is converted to a minimum operating 
voltage by performing a lookup operation in a timing/voltage 
translation table (Step 704). The timing/voltage translation 
table can be populated via chip testing. The entries in the 
translation table may take the format of "X ns decrease in speed 
requires a supply voltage adjustment by Y Volts". 
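Steps 703 and 704 can be sketched in Python as shown below. This is only a sketch under stated assumptions: the table contents are invented placeholders (real entries would come from chip testing), the table is expressed as "slack threshold, allowed fraction of V DD" rather than in the "X ns requires Y Volts" format, and the function names are hypothetical.

```python
# Hypothetical translation table, sorted by ascending slack threshold:
# (minimum slack in ns, allowed supply as a fraction of V DD).
TRANSLATION_TABLE = [(0.0, 1.00), (1.0, 0.95), (2.0, 0.90), (3.0, 0.85)]


def block_min_slack(path_slacks_ns):
    """Step 703: the tolerable delay increase of a block is the smallest
    slack among all paths that pass through the block."""
    return min(path_slacks_ns)


def min_operating_voltage(slack_ns, table=TRANSLATION_TABLE):
    """Step 704: look up the lowest supply fraction the slack allows."""
    voltage = table[0][1]
    for required_slack_ns, fraction in table:
        if slack_ns >= required_slack_ns:
            voltage = fraction
    return voltage


# Made-up per-block path slacks, keyed by block reference number.
blocks = {"801": [2.0, 3.5], "802": [0.4, 6.0]}
for block, slacks in blocks.items():
    print(block, min_operating_voltage(block_min_slack(slacks)))
```

Note that block 802 in this made-up data stays at the full supply even though one of its paths has 6 ns of slack, because the block is limited by its path with the least slack.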
[0063] After the minimum operating voltage for each 
programmable logic block has been determined, the configuration 
bit stream is generated in Step 104. The configuration bit 
stream is generated such that the configuration bit stream 
applies the minimum operating voltages as determined in Step 
704. As described in more detail below, the minimum operating 
voltages can be applied by setting the supply voltages to 
various programmable logic blocks of the PLD in response to the 
configuration data bits. The PLD is then configured in response 
to the configuration bit stream (Step 105). 
[0064] In an enhanced version of the above-described 
embodiment, an initial timing analysis is performed prior to the 
place and route operation (Step 103), based on estimated delays 
of the various paths. The place and route step is then guided 
to group paths with significant timing slacks into the same 
independently tunable block. 

[0065] Fig. 8 is a block diagram of a PLD 800 in accordance 
with the present embodiment of the invention. PLD 800 includes 
programmable logic blocks 801-804, high voltage (V DD _H) supply 
805, low voltage (V DD _L) supply 806, control circuit 809 and 
switch elements 851-858. In the described embodiment, switch 
elements 851-858 are implemented by PMOS power-gating 
transistors. Control circuit 809 is implemented by inverters 
811-814, NOR gates 821-824, configuration memory cells 831-834, 
and user logic input terminals 841-844. The high voltage supply 
805 is configured to provide a full V DD supply voltage, which is 
designated V DD _H. The low voltage supply 806 is configured to 
provide a reduced V DD supply voltage, which is designated V DD _L. 
The V DD _L supply voltage is less than the V DD _H supply voltage by 
a selected percentage. For example, the V DD _L supply voltage may 
be 80 percent of the V DD _H supply voltage. 

[0066] NOR gates 821-824 and inverters 811-814 are configured 
to generate the high voltage select signals Sel_H 1 -Sel_H 4 and the 
low voltage select signals Sel_L 1 -Sel_L 4 , in response to the 
configuration data values CD 1 -CD 4 stored in configuration memory 
cells 831-834, respectively, and the user control signals UC 1 -UC 4 
provided on user logic input terminals 841-844, respectively. 
[0067] For example, NOR gate 821 is coupled to receive 
configuration data value CD 1 from configuration memory cell 831 
and user control signal UC 1 from user logic input terminal 841. 
If either the configuration data value CD 1 or the user control 
signal UC 1 is activated to a logic high state (indicating that a 
substantial timing slack exists in programmable logic block 801, 
and that the V DD _L voltage supply 806 should be coupled to this 
block 801), then NOR gate 821 provides a low voltage select 
signal Sel_L 1 having a logic "0" state. In response, inverter 
811, which is coupled to the output terminal of NOR gate 821, 
provides a high voltage select signal Sel_H 1 having a logic "1" 
state. 

[0068] The logic "0" Sel_L 1 signal is applied to the gate of 
PMOS voltage select transistor 852, thereby turning on this 
transistor and coupling programmable logic block 801 to the V DD _L 
voltage supply 806. The logic "1" Sel_H 1 signal is applied to 
the gate of PMOS voltage select transistor 851, thereby turning 
off this transistor and isolating programmable logic block 801 
from the V DD _H voltage supply 805. As a result, programmable 
logic block 801 operates in response to the V DD _L supply voltage, 
thereby minimizing the leakage current in this block. 
[0069] If both the configuration data value CD 1 and the user 
control signal UC 1 are de-activated to a logic low state 
(indicating that no substantial timing slack exists in 
programmable logic block 801, and that the V DD _H voltage supply 
805 should be coupled to this block 801), then NOR gate 821 
provides a low voltage select signal Sel_L 1 having a logic "1" 
state. In response, inverter 811 provides a high voltage select 
signal Sel_H 1 having a logic "0" state. 

[0070] The logic "0" Sel_H 1 signal is applied to the gate of 
PMOS voltage select transistor 851, thereby turning on this 
transistor and coupling programmable logic block 801 to the V DD _H 
voltage supply 805. The logic "1" Sel_L 1 signal is applied to 
the gate of PMOS voltage select transistor 852, thereby turning 
off this transistor and isolating programmable logic block 801 
from the V DD _L voltage supply 806. As a result, programmable 
logic block 801 operates in response to the V DD _H supply voltage, 
thereby enabling this block to operate at the required speed. 
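The behavior of control circuit 809 for one block, as described in the preceding paragraphs, can be summarized by a small truth-table model. The Python sketch below is a behavioral illustration only; the function names and the string labels for the supplies are hypothetical, and the model ignores all analog effects.

```python
def select_signals(cd, uc):
    """Model NOR gate 821 and inverter 811 for block 801.
    cd: configuration data value CD 1, uc: user control signal UC 1."""
    sel_l = int(not (cd or uc))  # NOR gate output (low voltage select)
    sel_h = int(not sel_l)       # inverter output (high voltage select)
    return sel_h, sel_l


def coupled_supply(cd, uc):
    """Return which supply the block sees. A PMOS transistor conducts
    when its gate is at logic 0."""
    sel_h, sel_l = select_signals(cd, uc)
    high_on = sel_h == 0  # transistor 851 couples the high supply
    low_on = sel_l == 0   # transistor 852 couples the low supply
    assert high_on != low_on  # exactly one supply is ever coupled
    return "VDD_H" if high_on else "VDD_L"


for cd in (0, 1):
    for uc in (0, 1):
        print(f"CD={cd} UC={uc} -> {coupled_supply(cd, uc)}")
```

In this model the block is coupled to the high supply only when both inputs are low, and the two select signals are always complementary, so the two supplies are never connected to the block at the same time.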

[0071] Programmable logic block 802 is coupled to the V DD _H 
voltage supply 805 or the V DD _L voltage supply 806 in response to 
configuration data value CD 2 and user control signal UC 2 , in the 
same manner as block 801. Similarly, programmable logic block 
803 is coupled to the V DD _H voltage supply 805 or the V DD _L 
voltage supply 806 in response to configuration data value CD 3 
and user control signal UC 3 , in the same manner as block 801. 
Programmable logic block 804 is coupled to the V DD _H voltage 
supply 805 or the V DD _L voltage supply 806 in response to 
configuration data value CD 4 and user control signal UC 4 , in the 
same manner as block 801. 

[0072] The Sel_H 1 -Sel_H 4 and Sel_L 1 -Sel_L 4 signals can be 
controlled by the configuration data values CD 1 -CD 4 stored by 
configuration memory cells 831-834, which are best suited for 
coupling the associated blocks to the V DD _L voltage supply 806 at 
design time. If a block is not coupled to the V DD _L voltage 
supply 806 at design time, this block can be coupled to the V DD _L 
voltage supply 806 at run time by the user control signals UC 1 - 
UC 4 , which may be generated by the user logic. 
[0073] In accordance with another embodiment, a tunable 
programmable logic device is implemented by enabling local 
supply voltage scaling. In this scheme, each independently 
tunable programmable logic block is powered by a separate 
variable-voltage switching regulator. The programmable logic 
blocks are tuned by configuring the regulators to adjust the 
operating voltages applied to the programmable logic blocks. 
When the operating voltage of a programmable logic block is 
scaled down, the block becomes slower, and the dynamic and 
static power consumed by the block are dramatically reduced. 
[0074] Fig. 9 is a block diagram of a PLD 900 that implements 
variable voltage switching regulators in accordance with the 
present embodiment of the invention. PLD 900 includes 
programmable logic blocks 901-904, V DD voltage supply 905, 
configuration memory cell sets 911-914, user control terminal 
sets 921-924, and variable voltage switching regulators 931-934. 
Voltage regulators 931-934 are configured to provide operating 
voltages to programmable logic blocks 901-904, respectively, in 
response to the V DD supply voltage. Each of voltage regulators 
931-934 independently may select one of two or more possible 
operating voltages in response to the configuration data bits 
stored in configuration memory cell sets 911-914, respectively. 
For example, if configuration memory cell set 911 includes two 
configuration memory cells (N=2), then voltage regulator 931 may 
provide operating voltages equal to V DD , 0.95 V DD , 0.9 V DD or 0.85 V DD 
in response to the configuration data bits stored in 
configuration memory cell set 911. Other numbers of 
configuration memory cells and other operating voltages can be 
provided in other embodiments. 
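The two-bit example above can be sketched in Python as a simple lookup. The assignment of particular bit patterns to particular voltages is an assumption made for illustration; the description gives only the set of four voltages, not the encoding, and the function and table names are hypothetical.

```python
# Assumed encoding (not specified in the text): configuration bits -> scale.
VOLTAGE_SCALE = {
    (0, 0): 1.00,
    (0, 1): 0.95,
    (1, 0): 0.90,
    (1, 1): 0.85,
}


def regulator_output(v_dd, config_bits):
    """Operating voltage supplied by a regulator such as 931 for the
    given pair of configuration data bits."""
    return v_dd * VOLTAGE_SCALE[tuple(config_bits)]


# With N = 2 configuration memory cells there are 2**2 = 4 settings.
for bits in VOLTAGE_SCALE:
    print(bits, regulator_output(1.0, bits))
```

More generally, N configuration memory cells can select among up to 2^N operating voltages, which is why other cell counts give other numbers of voltage steps.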

[0075] It is further possible to tune PLD 900 dynamically 
(during runtime) to exploit variations in the application's 
workload or performance requirements. Many user designs go 
through periods of low workload, during which the affected 
blocks may be tuned to lower speed and lower power. The tuning 
is preferably initiated by the user design, since the user has 
the best knowledge of when extended periods of low workload will 
occur. One way to enable dynamic scaling of local voltages is 
through dynamic reconfiguration of the programmable regulators 
using techniques described by Brandon J. Blodget et al., 
"Reconfiguration of a Programmable Logic Device Using Internal 
Control," U.S. Patent Application Serial No. 10/377,857. 
[0076] In accordance with one embodiment, the user may 
implement such an adjustment by varying the N signals provided 
on the user control terminal set 921, or by rewriting the 
desired configuration memory bits into configuration memory cell 
set 911. The other variable voltage switching regulators 
932-934 are controlled in the same manner as voltage regulator 931. 
[0077] In the described embodiment, variable voltage 
switching regulators 931-934 reside on the same chip as 
programmable logic blocks 901-904. However, in other 
embodiments, these voltage regulators 931-934 can be located 
external to the chip containing blocks 901-904. 
[0078] Moreover, although the examples illustrate a PLD 
divided into four blocks, it should be understood that the PLD 
can be divided into an arbitrary number of blocks, and each block 
can be of arbitrary granularity. In the embodiment of Fig. 9, 
the granularity of the voltage scaled programmable logic blocks 
901-904 should be fairly large because the overhead associated 
with variable voltage switching regulators 931-934 is 
significant. In an FPGA, each programmable logic block 901-904 
would most likely be divided into several clusters of 
configurable logic blocks (CLBs). The exact size of each 
programmable logic block is determined by the desired trade-off 
between power savings and the layout area overhead of the 
switching regulators. Techniques for distributing multiple 
programmable voltages by using on-chip switching voltage 
regulators are described by Bernard J. New et al., in 
"Integrated Circuit With High-Voltage, Low-Current Power Supply 
Distribution And Methods Of Using The Same," US Patent 
Application Serial No. 10/606,619. 

[0079] Communication across programmable logic blocks having 
different operating voltages does not require special attention 
if the voltage difference is relatively small. However, when 
signals propagate from a low voltage block to a high voltage 
block, even small voltage differences can lead to significant DC 
current leakage in the high-voltage block due to transistors 
that are not completely turned off. To eliminate such DC 
current leakage, and to facilitate communication across two 
blocks of arbitrarily different voltages, level-shifters should 
be used as interfacing logic. To reduce area and speed 
overhead, level-shifters can be integrated into flip-flops, 
which are typically present on the programmable logic device. 
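The rule described above, that a level-shifter is needed wherever a signal travels from a lower-voltage block into a higher-voltage block, can be sketched in Python as follows. The block voltages and the connection list are invented for illustration, and the helper names are hypothetical.

```python
def needs_level_shifter(v_source, v_dest):
    """A signal driven from a lower supply into a higher supply may leave
    the receiving transistors partly on, so it needs a level-shifter."""
    return v_source < v_dest


def crossings_needing_shifters(block_voltages, connections):
    """Return the (source, destination) block pairs that need a
    level-shifter, given per-block operating voltages."""
    return [
        (src, dst)
        for src, dst in connections
        if needs_level_shifter(block_voltages[src], block_voltages[dst])
    ]


# Made-up operating voltages (as fractions of V DD) and inter-block signals.
voltages = {"901": 1.00, "902": 0.85, "903": 0.95}
signals = [("901", "902"), ("902", "901"), ("902", "903"), ("903", "903")]
print(crossings_needing_shifters(voltages, signals))
```

In this made-up example, only the signals leaving block 902 (the lowest-voltage block) toward blocks 901 and 903 are flagged.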
[0080] Fig. 10 is a circuit diagram of a level-shifting 
flip-flop 1000, for use in accordance with one embodiment of the 
present invention. Flip-flop 1000 includes inverters 1001-1004, 
complementary pass gates 1011-1012, p-channel transistors 
1021-1022, and n-channel transistors 1031-1034. Inverters 1001-1003 
and complementary pass gates 1011-1012 operate in response to 
the V DD _L supply voltage. When the CLK# signal is high (CLK is 
low), inverter 1001 is enabled to route the inverse of the input 
data value D to inverter 1002. Note that the input data value D 
is defined at the V DD _L voltage level. Inverter 1003 and 
complementary pass gates 1011-1012 are disabled by the low CLK 
signal at this time. 

[0081] When the CLK signal transitions to a logic high state 
(CLK# is low), inverter 1001 is disabled and inverter 1003 is 
enabled, thereby allowing the data value D to be latched into 
cross-coupled inverters 1002-1003. The logic low CLK# signal 
disables n-channel transistors 1033 and 1034. The high CLK 
signal also enables complementary pass gates 1011-1012, thereby 
applying the data value D and the inverse data value D# to the 
gates of n-channel transistors 1031 and 1032, respectively. As 
a result, the data value D or the inverse data value D# turns on 
one of n-channel transistors 1031 or 1032. For example, if the 
data value D has a logic low state, then n-channel transistor 
1031 is turned off and n-channel transistor 1032 is turned on. 
Turned on transistor 1032 pulls down the gate voltage of 
p-channel transistor 1021 to ground, thereby turning on this 
transistor 1021. Turned on transistor 1021 applies the V DD _H 
voltage to the gate of p-channel transistor 1022 (thereby 
turning this transistor off), and to the input terminal of 
inverter 1004 (which provides a logic low Q output signal). 
Note that the Q output signal has been translated to the V DD _H 
voltage level. When the CLK signal transitions to a logic low 
state (CLK# is high), the n-channel transistors 1033-1034 turn 
on, thereby latching the data value D until the next rising edge 
of the CLK signal. Level-shifting flip-flop 1000 is described 
in more detail by M. Takahashi et al., "A 60mW MPEG4 Video Codec 
using Clustered Voltage Scaling with Variable Supply-Voltage 
Scheme," Journal of Solid State Circuits, vol. 33, no. 11, pp. 
1772-1780, Nov. 1998. 

[0082] Although the invention has been described in 
connection with several embodiments, it is understood that this 
invention is not limited to the embodiments disclosed, but is 
capable of various modifications, which would be apparent to a 
person skilled in the art. For example, although the described 
embodiments included four programmable logic blocks, it is 
understood that other numbers of blocks can be used in other 
embodiments. Thus, the invention is limited only by the 
following claims. 